- >> The arguments that in-band designation of document format is better
- >> than out-of-band information may apply in the electronic mail
- >> scenarios, where there is a single sender, multiple recipients, and
- >> the recipient has no control over what the sender might send.
-
- >The argument is identical for most file servers, which have even less control
- >over the specifics of what files they offer for retrieval. File servers usually
- >rely on contributed material and only rarely have anything resembling precise
- >control over the material they offer.
-
- But we are not discussing 'file servers' in general. We are discussing
- something more specific, over which we presumably have more control: use of
- MIME content identifiers to identify content-type in World-Wide-Web
- and WAIS servers. Even in the case of file servers, while you might
- not have control over the material offered, you do have control over
- the description of that material as to which version of a purported
- standard format the material might be in, and even, in some cases,
- which profile of that standard might apply.
-
- >> If I wish to retrieve the document, say to view it, I might want to
- >> choose the available representation that is most appropriate for my
- >> purpose. Imagine my dismay to retrieve a 50 megabyte postscript file
- >> from an anonymous FTP archive, only to discover that it is in the
- >> newly announced Postscript level 4 format, or to try to edit it only
- >> to discover that it is in the (upwardly compatible but not parsable by
- >> my client) version 44 of Rich Text. In each case, the appropriateness
- >> of alternate sources and representations of a document would depend on
- >> information that is currently only available in-band.
-
- >Even if this happens (I have strong doubts that it will since documents made
- >available for public retrieval tend to converge rapidly to lowest-common
- >denominator usage) you have failed to propose an alternative that solves this
- >usefully.
-
- Documents made available for public retrieval cannot 'tend to
- converge rapidly to lowest-common denominator usage', because *old
- documents do not go away*! If there is diversity today in the
- available formats for RFCs, tech reports and PhD theses, that
- diversity can only get worse! It is foolish to think that the
- diversity will diminish any time in the near future; certainly the
- number of 'conference proceedings on CD-rom' is increasing, as people
- want to share Mathematica documents, various forms of hypertext, audio
- content and the like.
-
- As for a proposal that 'solves this usefully', I have a fairly mild
- proposal that, while it does not solve all of the problems in
- interoperability, does reduce the amount of uncertainty:
-
- I propose (once again) that instead of saying 'application/postscript'
- the content-type say, at a minimum, 'application/postscript 1985' vs
- 'application/postscript 1994' or whatever you would like to designate
- as a way to uniquely identify which edition of the Postscript
- reference manual you are talking about; instead of being identified as
- 'image/tiff' the files be identified as 'image/tiff 5.0 Class F' vs
- 'image/tiff 7.0 class QXB'.
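
To make the proposal concrete, here is a minimal sketch of how a client
might use such a label to decide, before retrieving anything, whether it
can handle a document. The 'type/subtype edition' syntax is taken from the
examples above; the names parse_versioned_type, can_handle and SUPPORTED
are hypothetical and chosen for illustration only, not part of MIME or of
any existing client.

```python
from typing import NamedTuple, Optional


class VersionedType(NamedTuple):
    media_type: str          # e.g. "application"
    subtype: str             # e.g. "postscript"
    version: Optional[str]   # e.g. "1985", or None when no edition is given


def parse_versioned_type(label: str) -> VersionedType:
    """Split a 'type/subtype edition' label into its three parts."""
    # Everything after the first space is treated as the edition string.
    name, _, version = label.strip().partition(" ")
    media_type, sep, subtype = name.partition("/")
    if not sep or not media_type or not subtype:
        raise ValueError(f"not a type/subtype label: {label!r}")
    version = " ".join(version.split())  # collapse runs of whitespace
    return VersionedType(media_type.lower(), subtype.lower(), version or None)


def can_handle(label: str, supported: dict[str, set[str]]) -> bool:
    """True only if the client lists this exact edition of the format."""
    parsed = parse_versioned_type(label)
    key = f"{parsed.media_type}/{parsed.subtype}"
    if parsed.version is None:
        # An unversioned label tells the client nothing about the edition.
        return False
    return parsed.version.lower() in supported.get(key, set())


# Hypothetical client capabilities; editions are stored in lower case.
SUPPORTED = {
    "application/postscript": {"1985"},
    "image/tiff": {"5.0 class f"},
}

for label in ("application/postscript 1985",
              "application/postscript 1994",
              "image/tiff 5.0 Class F",
              "image/tiff 7.0 class QXB"):
    print(label, "->", can_handle(label, SUPPORTED))
```

The sketch treats an unversioned label as unusable, which is the strictest
reading of the proposal; a real client might instead fall back to
retrieving the document and inspecting it.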
-
- > Finally, let me point out that I speak as one of the maintainers of one of the
- > largest archive of TeX material available anywhere. This material has been
- > available via MIME-compliant mail server (and of course FTP) for over six
- > months now. This archive contains hundreds of PostScript documents as well
- > as all sorts of other stuff. The problems you seem to think are endemic to
- > this sort of services have yet to materialize.
-
- I think you need to take a longer-term and broader perspective than a
- six-month experience with a single document representation.
-
-
- We've been developing a document archive service that can cope with 20
- years of collected electronic documents. We have not only Postscript 1
- and 2, but also several versions of Interpress, and Press format, two
- versions of DVI, revisable formats of 20 years of editor development
- -- several versions of tex, latex, framemaker, microsoft word, tioga,
- globalview, viewpoint, bravo, bravox, tedit, troff, interleaf,
- wordperfect, etc, and images in multiple variations of RES, AIS, TIFF,
- sun raster, pcx, macpaint, ad nauseam.
-
- In trying to deal with documents over the longer term, it has become
- apparent that merely marking documents with a simple 'format' tag like
- 'interpress' or 'postscript' or 'tiff' isn't adequate for most
- purposes. Standards evolve over periods as short as five years; even
- the method of tagging standard versions internally changes, and it is
- certainly impossible to rely on in-band version information for all
- formats.
-
- I have more to say about the problem of 'external references' but I'll
- save that for another message.
-
- It would be nice to have a calm discussion about possible solutions to
- these problems, and I hope you will forgo future sarcasm.
-
- Thanks,
-
- Larry
-
-